AIbase

# RLHF optimization

Fsfairx Gemma2 RM V0.1
A reward model based on the Gemma-2-9B architecture and trained with an RLHF workflow, suitable for dialogue and reasoning tasks.
Large Language Model Transformers
sfairXC
51
7
Norgpt 3B Rfhl Summarization
A text summarization model based on NorGPT-3B, fine-tuned on Norwegian news summarization datasets using an RLHF strategy.
Text Generation Transformers Other
NorGLM
56
0
Distilroberta Base Rejection V1
Apache-2.0
A text classification model fine-tuned from distilroberta-base to identify rejection (refusal) responses generated by large language models.
Text Classification Transformers English
protectai
74.91k
7
Starling LM 7B Alpha
Apache-2.0
An open-source large language model trained with Reinforcement Learning from AI Feedback (RLAIF), with strong results on the MT-Bench benchmark.
Large Language Model Transformers English
berkeley-nest
9,765
558
Xwin LM 70B V0.1
Xwin-LM is a language model based on Llama 2 that focuses on alignment techniques for large language models and performs strongly on the AlpacaEval benchmark.
Large Language Model Transformers
Xwin-LM
1,161
214
© 2025 AIbase